
Deep Compositional Captioning: Describing Novel Object Categories without Paired Training Data


Abstract

While recent deep neural network models have achieved promising results on the image captioning task, they rely largely on the availability of corpora with paired image and sentence captions to describe objects in context. In this work, we propose the Deep Compositional Captioner (DCC) to address the task of generating descriptions of novel objects which are not present in paired image-sentence datasets. Our method achieves this by leveraging large object recognition datasets and external text corpora and by transferring knowledge between semantically similar concepts. Current deep caption models can only describe objects contained in paired image-sentence corpora, despite the fact that they are pre-trained with large object recognition datasets, namely ImageNet. In contrast, our model can compose sentences that describe novel objects and their interactions with other objects. We demonstrate our model's ability to describe novel concepts by empirically evaluating its performance on MSCOCO and show qualitative results on ImageNet images of objects for which no paired image-caption data exist. Further, we extend our approach to generate descriptions of objects in video clips. Our results show that DCC has distinct advantages over existing image and video captioning approaches for generating descriptions of new objects in context.
